OMNI: A Framework for Integrating Hardware and Software Optimizations for Sparse CNNs
Abstract
Convolutional neural networks (CNNs), one of today's main flavors of deep learning, dominate various image recognition tasks. As the model size of modern CNNs continues to grow, network compression techniques have been proposed to prune redundant neurons and synapses. However, prior techniques disconnect software network compression from hardware acceleration and therefore fail to balance multiple design parameters, including sparsity, performance, hardware area cost, and efficiency. More concretely, unstructured pruning techniques achieve high sparsity at the expense of extra performance overhead, while structured pruning techniques, relying on strict sparse patterns, lead to low sparsity and extra hardware cost. In this article, we propose OMNI, a framework for accelerating sparse CNNs on hardware accelerators. The innovation of OMNI stems from its use of hardware-amenable on-chip memory partition patterns to seamlessly engage software CNN compression and hardware CNN acceleration. To accelerate the compute-intensive convolution kernel, a promising hardware optimization approach is memory partition, which divides the original weight kernels into several groups so that different processing elements can simultaneously access the weights. We exploit memory partition patterns, including block, cyclic, or hybrid, as the means of CNN compression patterns. Our software CNN compression balances the sparsity across different groups, and our hardware accelerator employs hardware parallelization coordinately with the sparse patterns, leading to a desirable compromise between sparsity and hardware performance. We further develop performance models to help designers quickly identify the pattern factors subject to an area constraint. Last, we evaluate our design on application-specific integrated circuit (ASIC) and field-programmable gate array (FPGA) platforms. Experiments demonstrate that OMNI achieves 3.4× to 6.2× speedup on modern CNNs over a comparably ideal dense CNN accelerator. OMNI shows a 114.7× energy efficiency improvement compared with a GPU platform. OMNI is also evaluated on Xilinx ZC706 and ZCU102 FPGA platforms, achieving 41.5 GOP/s and 125.3 GOP/s, respectively.
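To make the block and cyclic partition patterns named in the abstract concrete, the following minimal Python sketch shows how a flattened weight array might be assigned to memory banks, and how pruning can keep the same number of nonzero weights in every bank. This is an illustration under stated assumptions, not the OMNI implementation: the function names (`block_partition`, `cyclic_partition`, `balanced_prune`), the magnitude-based pruning criterion, and the fixed per-bank budget are all hypothetical, and the hybrid pattern is not shown.

```python
def block_partition(n, banks):
    """Block pattern: consecutive indices share a bank (hypothetical helper)."""
    size = -(-n // banks)  # ceiling division: elements per bank
    return [i // size for i in range(n)]


def cyclic_partition(n, banks):
    """Cyclic pattern: indices are dealt to banks round-robin (hypothetical helper)."""
    return [i % banks for i in range(n)]


def balanced_prune(weights, bank_of, banks, keep_per_bank):
    """Keep the keep_per_bank largest-magnitude weights in each bank, zero the rest.

    Assumption: magnitude pruning with an equal budget per bank, so that
    parallel processing elements each see the same number of nonzeros.
    """
    pruned = [0.0] * len(weights)
    for b in range(banks):
        idx = [i for i in range(len(weights)) if bank_of[i] == b]
        idx.sort(key=lambda i: abs(weights[i]), reverse=True)
        for i in idx[:keep_per_bank]:
            pruned[i] = weights[i]
    return pruned


weights = [0.9, -0.1, 0.2, 0.8, -0.7, 0.05, 0.3, -0.6]
banks = cyclic_partition(len(weights), 4)
sparse = balanced_prune(weights, banks, 4, keep_per_bank=1)
```

With equal nonzero counts per bank, no single processing element becomes the bottleneck, which is the kind of balance between sparsity and hardware parallelism the abstract describes.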
Similar resources
A Framework for Identifying and Prioritizing Factors Affecting Customers’ Online Shopping Behavior in Iran
The purpose of this study is to identify the factors that lead customers to shop online in Iran and to investigate the importance of those factors in online customers’ decisions. In the identification phase, to discover the factors affecting the online shopping behavior of customers in Iran, the derived reference model summarizing antecedents of online shopping proposed by Chang et al. was us...
A Unified Energy Estimation Framework with Integrated Hardware-Software Optimizations
With the emergence of a plethora of embedded and portable applications, energy dissipation has joined throughput, VLSI layout area, and accuracy/precision as a major design constraint. Thus, designers must be concerned with both optimizing and estimating the energy consumption of circuits, architectures, and software. Most of the research in energy optimization and/or estimation has focused on ...
Passivity in Waiting for Godot and Endgame: A Psychoanalytic Reading
This study investigates Samuel Beckett’s Waiting for Godot and Endgame through Lacanian psychoanalysis. It begins by explaining the most important concepts of Lacanian psychoanalysis. The Beckettian characters are studied with regard to their unconscious state, rather than their conscious state as is common in most Beckett studies. According to Lacan, language plays the sole role in ...
Hardware/software Techniques for Memory Power Optimizations in Embedded Processors
Power has become one of the primary design constraints in modern microprocessors. This is all the more true in the embedded domain where designers are being pushed to create faster processors that operate for long periods of time on a single battery. It is well known that the memory sub-system is responsible for a significant percentage of the overall power dissipation. For example, in the Stro...
Hardware Optimizations for Crypto Implementations
Latency, Area, and Power are three important metrics that a VLSI designer wants to optimize. However, often one of these may have to be optimized at the cost of another or the other two. Depending on the application scenario, choice of the metric to optimize is made. In this paper, we consider hardware implementations of a number of cryptographic primitives and present a number of optimizations...
Journal
Journal title: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Year: 2021
ISSN: 1937-4151, 0278-0070
DOI: https://doi.org/10.1109/tcad.2020.3023903